54 research outputs found
Learning New Classes from Limited Data in Image Segmentation and Object Detection
The abstract is in the attachment.
The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots
Deep networks have brought significant advances in robot perception, improving
the capabilities of robots in several visual tasks, ranging from
object detection and recognition to pose estimation, semantic scene
segmentation and many others. Still, most approaches typically address visual
tasks in isolation, resulting in overspecialized models which achieve strong
performance in specific applications but work poorly in other (often related)
tasks. This is clearly sub-optimal for a robot, which is often required to
perform multiple visual recognition tasks simultaneously in order to properly
act and interact with the environment. This problem is exacerbated by the
limited computational and memory resources typically available onboard a
robotic platform. The problem of learning flexible models which can handle
multiple tasks in a lightweight manner has recently gained attention in the
computer vision community and benchmarks supporting this research have been
proposed. In this work we study this problem in the robot vision context,
proposing a new benchmark, the RGB-D Triathlon, and evaluating state-of-the-art
algorithms in this novel challenging scenario. We also define a new evaluation
protocol, better suited to the robot vision setting. Results shed light on the
strengths and weaknesses of existing approaches and on open issues, suggesting
directions for future research.
Comment: This work has been submitted to IROS/RAL 201
CoMFormer: Continual Learning in Semantic and Panoptic Segmentation
Continual learning for segmentation has recently seen increasing interest.
However, all previous works focus on narrow semantic segmentation and disregard
panoptic segmentation, an important task with real-world impact. In this
paper, we present the first continual learning model capable of operating on
both semantic and panoptic segmentation. Inspired by recent transformer
approaches that consider segmentation as a mask-classification problem, we
design CoMFormer. Our method carefully exploits the properties of transformer
architectures to learn new classes over time. Specifically, we propose a novel
adaptive distillation loss along with a mask-based pseudo-labeling technique to
effectively prevent forgetting. To evaluate our approach, we introduce a novel
continual panoptic segmentation benchmark on the challenging ADE20K dataset.
Our CoMFormer outperforms all the existing baselines by forgetting fewer old
classes while also learning new classes more effectively. In addition, we
report an extensive evaluation in the large-scale continual semantic
segmentation scenario showing that CoMFormer also significantly outperforms
state-of-the-art methods.
Comment: Under submission
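To make the idea of mask-based pseudo-labeling concrete, here is a minimal sketch in plain Python. It is not the CoMFormer implementation: the function name `pseudo_label`, the `(class_id, confidence, mask)` tuple layout, the `min_conf` cut-off and the "most confident mask wins" rule are all assumptions made for illustration. The sketch only shows the general pattern of filling pixels that the current step leaves as background with old-class predictions from a previous model.

```python
def pseudo_label(gt, old_masks, min_conf=0.5, background=0):
    """Fill background pixels of `gt` with old-class predictions.

    gt:        2D list of class ids annotated in the current step.
    old_masks: list of (class_id, confidence, mask) tuples predicted by the
               previous model, where mask is a 2D list of 0/1 values.
               (Hypothetical layout, chosen for this sketch.)
    """
    out = [row[:] for row in gt]  # copy so the input annotation is untouched
    # Visit the most confident masks first so they win on overlapping pixels.
    for class_id, conf, mask in sorted(old_masks, key=lambda m: -m[1]):
        if conf < min_conf:
            continue  # assumed rule: ignore low-confidence masks
        for r, row in enumerate(mask):
            for c, inside in enumerate(row):
                # Only pixels still marked as background are relabeled.
                if inside and out[r][c] == background:
                    out[r][c] = class_id
    return out


gt = [[0, 0, 7],
      [0, 7, 7]]
old_masks = [
    (3, 0.9, [[1, 1, 1], [0, 0, 0]]),
    (4, 0.6, [[0, 1, 0], [1, 1, 0]]),
    (5, 0.2, [[1, 1, 1], [1, 1, 1]]),  # below min_conf, skipped
]
labels = pseudo_label(gt, old_masks)
```

Pixels annotated with the new class (7 in this toy input) are never overwritten, so the new-class supervision is preserved while old classes are reintroduced around it.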
Boosting Deep Open World Recognition by Clustering
While convolutional neural networks have brought significant advances in
robot vision, their ability is often limited to closed world scenarios, where
the number of semantic concepts to be recognized is determined by the available
training set. Since it is practically impossible to capture all possible
semantic concepts present in the real world in a single training set, we need
to break the closed world assumption, equipping our robot with the capability
to act in an open world. To provide such an ability, a robot vision system should
be able to (i) identify whether an instance does not belong to the set of known
categories (i.e. open set recognition), and (ii) extend its knowledge to learn
new classes over time (i.e. incremental learning). In this work, we show how we
can boost the performance of deep open world recognition algorithms by means of
a new loss formulation enforcing a global-to-local clustering of class-specific
features. In particular, a first loss term, i.e. global clustering, forces the
network to map samples closer to the centroid of the class they belong to, while the
second one, local clustering, shapes the representation space in such a way
that samples of the same class get closer in the representation space while
pushing away neighbours belonging to other classes. Moreover, we propose a
strategy to learn class-specific rejection thresholds, instead of heuristically
estimating a single global threshold, as in previous works. Experiments on
RGB-D Object and Core50 datasets show the effectiveness of our approach.
Comment: IROS/RAL 202
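The global clustering term and the class-specific rejection thresholds described above can be illustrated with a small plain-Python sketch. It rests on several assumptions: features are short lists of floats, the global term is taken to be the mean squared Euclidean distance to the class centroid, and each threshold is simply the largest training distance observed for that class. The abstract says the thresholds are learned, so this rule is a stand-in, and the local clustering term is not covered. All function names are hypothetical.

```python
import math
from collections import defaultdict


def class_centroids(features, labels):
    """Mean feature vector of each class."""
    groups = defaultdict(list)
    for f, y in zip(features, labels):
        groups[y].append(f)
    return {y: [sum(col) / len(fs) for col in zip(*fs)]
            for y, fs in groups.items()}


def global_clustering_loss(features, labels, centroids):
    """Assumed form: mean squared distance of samples to their own centroid."""
    total = sum(math.dist(f, centroids[y]) ** 2
                for f, y in zip(features, labels))
    return total / len(features)


def class_thresholds(features, labels, centroids):
    """Stand-in rule: per-class threshold = largest training distance."""
    thr = defaultdict(float)
    for f, y in zip(features, labels):
        thr[y] = max(thr[y], math.dist(f, centroids[y]))
    return dict(thr)


def predict(x, centroids, thresholds):
    """Nearest-centroid class, or None (unknown) if beyond its threshold."""
    y = min(centroids, key=lambda c: math.dist(x, centroids[c]))
    return y if math.dist(x, centroids[y]) <= thresholds[y] else None


features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
labels = ["a", "a", "b", "b"]
cents = class_centroids(features, labels)
thr = class_thresholds(features, labels, cents)
loss = global_clustering_loss(features, labels, cents)
known = predict([0.0, 1.5], cents, thr)    # close to the "a" centroid
unknown = predict([5.0, 5.0], cents, thr)  # far from every centroid
```

With one threshold per class, a compact class can reject outliers tightly while a spread-out class stays permissive, which a single global threshold cannot do.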
On the Challenges of Open World Recognition under Shifting Visual Domains
Robotic visual systems operating in the wild must act in unconstrained
scenarios, under different environmental conditions while facing a variety of
semantic concepts, including unknown ones. To this end, recent works tried to
empower visual object recognition methods with the capability to i) detect
unseen concepts and ii) extend their knowledge over time, as images of new
semantic classes arrive. This setting, called Open World Recognition (OWR), aims
to produce systems capable of breaking the semantic limits present in
the initial training set. However, this training set imposes on the system not
only its own semantic limits, but also environmental ones, due to its bias
toward certain acquisition conditions that do not necessarily reflect the high
variability of the real world. This discrepancy between training and test
distributions is called domain-shift. This work investigates whether OWR
algorithms are effective under domain-shift, presenting the first benchmark
setup for fairly assessing the performance of OWR algorithms, with and without
domain-shift. We then use this benchmark to conduct analyses in various
scenarios, showing how existing OWR algorithms indeed suffer a severe
performance degradation when train and test distributions differ. Our analysis
shows that this degradation is only slightly mitigated by coupling OWR with
domain generalization techniques, indicating that the mere plug-and-play of
existing algorithms is not enough to recognize new and unknown categories in
unseen domains. Our results clearly point toward open issues and future
research directions that need to be investigated for building robot visual
systems able to function reliably under these challenging yet very real
conditions. Code available at
https://github.com/DarioFontanel/OWR-VisualDomains
Comment: RAL/ICRA 202
Mask2Anomaly: Mask Transformer for Universal Open-set Segmentation
Segmenting unknown or anomalous object instances is a critical task in
autonomous driving applications, and it is approached traditionally as a
per-pixel classification problem. However, reasoning individually about each
pixel without considering their contextual semantics results in high
uncertainty around the objects' boundaries and numerous false positives. We
propose a paradigm change by shifting from a per-pixel classification to a mask
classification. Our mask-based method, Mask2Anomaly, demonstrates the
feasibility of integrating a mask-classification architecture to jointly
address anomaly segmentation, open-set semantic segmentation, and open-set
panoptic segmentation. Mask2Anomaly includes several technical novelties that
are designed to improve the detection of anomalies/unknown objects: i) a global
masked attention module to focus individually on the foreground and background
regions; ii) a mask contrastive learning that maximizes the margin between an
anomaly and known classes; iii) a mask refinement solution to reduce false
positives; and iv) a novel approach to mine unknown instances based on the
mask-architecture properties. By comprehensive quantitative and qualitative
evaluation, we show Mask2Anomaly achieves new state-of-the-art results across
the benchmarks of anomaly segmentation, open-set semantic segmentation, and
open-set panoptic segmentation.
Comment: 16 pages. arXiv admin note: substantial text overlap with
arXiv:2307.1331
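As a rough illustration of how a mask-classification output can yield a per-pixel anomaly score, the sketch below combines per-mask class probabilities with per-pixel mask membership probabilities and scores each pixel by one minus its strongest known-class evidence. The scoring rule, the function name `anomaly_scores`, and the flat pixel indexing are assumptions made for this example. The abstract does not spell out the formula, so this should not be read as the Mask2Anomaly implementation.

```python
def anomaly_scores(class_probs, mask_probs):
    """Per-pixel anomaly scores from mask-classification outputs.

    class_probs[i][c]: probability that mask i belongs to known class c.
    mask_probs[i][p]:  probability that pixel p belongs to mask i.
    Higher score = less evidence for any known class (assumed rule).
    """
    n_classes = len(class_probs[0])
    n_pixels = len(mask_probs[0])
    scores = []
    for p in range(n_pixels):
        # Evidence for class c at pixel p, aggregated over all masks.
        per_class = [
            sum(cp[c] * mp[p] for cp, mp in zip(class_probs, mask_probs))
            for c in range(n_classes)
        ]
        scores.append(1.0 - max(per_class))
    return scores


class_probs = [[0.9, 0.1],   # mask 0 is most likely class 0
               [0.2, 0.8]]   # mask 1 is most likely class 1
mask_probs = [[1.0, 0.0, 0.0],   # mask 0 covers pixel 0
              [0.0, 1.0, 0.0]]   # mask 1 covers pixel 1
scores = anomaly_scores(class_probs, mask_probs)
```

In this toy input the third pixel is covered by no mask, so it receives the highest score. The decision is made per mask, not per isolated pixel, which is the shift the abstract argues for.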
Hydrogen jet-fire: Accident investigation and implementation of safety measures for the design of a downstream oil plant
As is well known, hydrogen plays a very significant role in the process industry and is vital to oil refineries, namely for secondary-level refining units such as hydro-treating and hydrocracking sections. This paper starts from a statistical analysis of hydrogen accidents and a thorough investigation of the sequence and causes of an accident involving a hydrogen leak in a downstream oil industry. We present some key features of the accident and comment on some practical implications for setting up risk reduction options at the plant level. The applicative phase of the paper states the main prevention strategies and suggests possible mitigation measures for hydrogen leak events, discussing some practical solutions applied in the design of a large refinery. The experience and lessons learned from the event investigation, together with the comparison of the accident with the predictions of the safety report, lead to the formulation of proposals and design modifications aimed at preventing or at least minimizing the consequences
- …